Topic 8: Exploiting Time Variation
London School of Economics and Political Science
March 2, 2026
Following the same units over time enables new identification and estimation strategies.
\[\Delta y_{it} = \beta \Delta x_{it} + \Delta u_{it}\]
\[y_{it} = \beta x_{it} + \sum_{j=2}^{N} \gamma_j \mathbb{1}[i=j] + u_{it}\]
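The two estimators above can be compared on simulated data. The following is a minimal sketch, assuming a two-period panel in which \(x_{it}\) is correlated with the unit effect; the sample size, the true \(\beta = 1.5\), the variable names, and the inclusion of an intercept in the dummy-variable regression are illustrative assumptions, not part of the original slides.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, beta = 200, 2, 1.5                        # hypothetical panel size and true slope

gamma = rng.normal(size=N)                      # unit effects gamma_i
x = gamma[:, None] + rng.normal(size=(N, T))    # x_it is correlated with gamma_i
u = rng.normal(scale=0.5, size=(N, T))
y = beta * x + gamma[:, None] + u

def ols(X, y):
    """Return OLS coefficients for y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Pooled OLS ignores gamma_i, so the slope absorbs Cov(x, gamma).
x_long, y_long = x.ravel(), y.ravel()           # row-major: unit 0 (t=1, t=2), unit 1, ...
b_pooled = ols(np.column_stack([np.ones(N * T), x_long]), y_long)[1]

# First differences: gamma_i cancels within each unit.
dx, dy = x[:, 1] - x[:, 0], y[:, 1] - y[:, 0]
b_fd = ols(dx[:, None], dy)[0]

# LSDV: intercept plus dummies for units 2..N (unit 1 is the base, an assumption here).
unit = np.repeat(np.arange(N), T)
dummies = (unit[:, None] == np.arange(1, N)[None, :]).astype(float)
b_lsdv = ols(np.column_stack([x_long, np.ones(N * T), dummies]), y_long)[0]

print(f"pooled OLS: {b_pooled:.3f}  FD: {b_fd:.3f}  LSDV: {b_lsdv:.3f}")
```

With \(T = 2\) and no time effects, the FD and LSDV slopes coincide up to floating-point error, while pooled OLS is pulled away from \(\beta\) by the omitted unit effect.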
\[ \begin{aligned} \log(\text{fare})_{it} &= \beta_0 + \beta_1\log(\text{distance})_i + \beta_2\text{competition}_{it} + e_{it} \\ e_{it} &= \gamma_i + \delta_t + u_{it} \end{aligned} \]
\[\begin{align*} \widehat{\log(\text{fare})}_{it} &= \hat{\beta}_{0} + \hat{\beta}_{1}\log(\text{distance})_{i} \\ &+ \hat{\beta}_{2}\text{competition}_{it} \\ &+ \hat{\delta}_{1}\mathbb{1}[t=2007] \\ &+ \hat{\delta}_{2}\mathbb{1}[t=2012] \end{align*}\]
Since \(\text{Cov}(\text{competition}_{it}, \gamma_i) \neq 0\), OLS is biased.
\[ \Delta\log(\text{fare})_{it} = \beta_2\Delta\text{competition}_{it} + \Delta\delta_t + \Delta u_{it} \]
\(\gamma_i - \gamma_i = 0\): time-invariant route characteristics disappear.
With year dummies for transition periods (2002–2007 base, 2007–2012):
\[ \widehat{\Delta\log(\text{fare})}_{it} = \hat\alpha + \hat\beta_2\Delta\text{competition}_{it} + \hat\delta\mathbb{1}[\text{transition } 2007-2012] \]
(1) FD + year dummies, or (2) full LSDV with dummies for both units and time periods.
\[\begin{align*} \log(\text{patents})_{it} &= \beta_0 + \beta_1\log(\text{R\&D})_{it} \\ &+ \beta_2\mathbb{1}[t=2006] + \beta_3\mathbb{1}[t=2007] \\ &+ e_{it} \end{align*}\]
\[\begin{align*} \mathbb{E}[\log(\text{patents})_{it} \mid t=2005] &= \beta_0 + \beta_1\log(\text{R\&D})_{it} \\ \mathbb{E}[\log(\text{patents})_{it} \mid t=2006] &= (\beta_0 + \beta_2) + \beta_1\log(\text{R\&D})_{it} \\ \mathbb{E}[\log(\text{patents})_{it} \mid t=2007] &= (\beta_0 + \beta_3) + \beta_1\log(\text{R\&D})_{it} \end{align*}\]
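Because the outcome is in logs, the year dummies measure approximate growth rates relative to the 2005 base year (holding R&D fixed), not the difference in the level of patents between 2006 and 2005.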
\[ \begin{align*} \log(\text{patents})_{it} &= \beta_0 + \beta_1\log(\text{R\&D})_{it} + \beta_2\mathbb{1}[t=2006] + \beta_3\mathbb{1}[t=2007] \\ &+ \beta_4(\log(\text{R\&D})_{it} \times \mathbb{1}[t=2006]) + \beta_5(\log(\text{R\&D})_{it} \times \mathbb{1}[t=2007]) \\ &+ e_{it} \end{align*} \]
\[\begin{align*} \mathbb{E}[\log(\text{patents})_{it} \mid t=2005] &= \beta_0 + \beta_1\log(\text{R\&D})_{it} \\ \mathbb{E}[\log(\text{patents})_{it} \mid t=2006] &= (\beta_0 + \beta_2) + (\beta_1 + \beta_4)\log(\text{R\&D})_{it} \\ \mathbb{E}[\log(\text{patents})_{it} \mid t=2007] &= (\beta_0 + \beta_3) + (\beta_1 + \beta_5)\log(\text{R\&D})_{it} \end{align*}\]
| Year | Intercept | Elasticity |
|---|---|---|
| 2005 | \(\beta_0\) | \(\beta_1\) |
| 2006 | \(\beta_0 + \beta_2\) | \(\beta_1 + \beta_4\) |
| 2007 | \(\beta_0 + \beta_3\) | \(\beta_1 + \beta_5\) |
No need to interpret \(\beta_4\) and \(\beta_5\) individually; the conditional means do the work.
\[ \begin{align*} \log(\text{wages})_{it} &= \beta_0 + \beta_1\mathbb{1}[i\text{ is male}] + \beta_2 t + \beta_3\mathbb{1}[t \geq 2005] \\ &+ \beta_4(\mathbb{1}[i\text{ is male}] \times t) + \beta_5(\mathbb{1}[i\text{ is male}] \times \mathbb{1}[t \geq 2005]) \\ &+ \beta_6(t \times \mathbb{1}[t \geq 2005]) + \beta_7(t \times \mathbb{1}[t \geq 2005] \times \mathbb{1}[i\text{ is male}]) \\ &+ e_{it} \end{align*} \]
This model captures level differences, trends, and how both changed after 2005, separately for men and women.
\[ \mathbb{E}[\log(\text{wages})_{it} \mid \mathbb{1}[i \text{ is male}]=0,t<2005] = \beta_0 + \beta_2 t \]
\[ \mathbb{E}[\log(\text{wages})_{it} \mid \mathbb{1}[i \text{ is male}]=1,t<2005] = (\beta_0 + \beta_1) + (\beta_2 + \beta_4)t \]
\(\beta_1\) shifts the intercept; \(\beta_4\) shifts the slope.
\[ \mathbb{E}[\log(\text{wages})_{it} \mid \mathbb{1}[i \text{ is male}]=0,\; t \geq 2005] = (\beta_0 + \beta_3) + (\beta_2 + \beta_6)t\]
\[\begin{align*} \mathbb{E}[\log(\text{wages})_{it} \mid \mathbb{1}[i \text{ is male}]=1,\; t \geq 2005] &= (\beta_0 + \beta_1 + \beta_3 + \beta_5) \\ &\quad + (\beta_2 + \beta_4 + \beta_6 + \beta_7)t \end{align*}\]
Each coefficient modifies either the intercept or slope for a specific group-period combination.
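The four conditional expectations above can be read off mechanically from the coefficient vector. A small sketch follows; the function name `wage_line` and the coefficient values are hypothetical and chosen only to illustrate the bookkeeping.

```python
def wage_line(b, male, post):
    """Intercept and slope in t implied by b[0..7] (beta_0..beta_7) for one group-period."""
    intercept = b[0] + b[1] * male + b[3] * post + b[5] * male * post
    slope = b[2] + b[4] * male + b[6] * post + b[7] * male * post
    return intercept, slope

# Hypothetical coefficient values, for illustration only.
b = [2.0, 0.30, 0.020, 0.10, 0.005, -0.05, -0.004, 0.001]

for male in (0, 1):
    for post in (0, 1):
        a, s = wage_line(b, male, post)
        print(f"male={male} post={post}: intercept={a:.3f}, slope={s:.4f}")
```

Setting `male` and `post` to 0 or 1 switches the relevant interaction terms on and off, which is the same logic as the four equations above.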
\[ e_{it} = \alpha_i + v_{it} \]
\[ \Delta y_i = \delta + \beta_1\Delta x_{i1} + \cdots + \beta_k\Delta x_{ik} + \Delta v_i \]
Example: cannot estimate returns to education via FD if education does not change over time.
Less scope for OVB, but not zero.
\[ \text{education}_{it} = \text{education}^{*}_{it} + e_{it} \]
\[ \log(\text{wage})_i = \alpha + \beta\text{education}^{*}_i + \epsilon_i \]
Observed model substitutes \(\text{education}_i = \text{education}^{*}_i + e_i\)
\[ \text{plim}\;\hat\beta = \beta \cdot \frac{\text{Var}(\text{educ}^*)}{\text{Var}(\text{educ}^*) + \text{Var}(e)} \]
The ratio is less than 1, so the coefficient is biased towards zero (derived in Topic 6).
\[ \Delta\text{education}_i = \Delta\text{education}^{*}_{i} + (e_{i2} - e_{i1}) \]
\[ \text{Var}(e_{i2} - e_{i1}) = \text{Var}(e_{i1}) + \text{Var}(e_{i2}) \quad \text{if } \text{Cov}(e_{i1}, e_{i2}) = 0 \]
\[ \text{plim}\;\hat\beta_{\text{FD}} = \beta \cdot \frac{\text{Var}(\Delta\text{educ}^*)}{\text{Var}(\Delta\text{educ}^*) + \text{Var}(e_{i1}) + \text{Var}(e_{i2})} \]
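The two attenuation formulas can be checked by simulation. The sketch below assumes classical measurement error that is independent across periods, true schooling that varies a lot across people but little over time, and no unit effect in the wage equation; all variances and the true \(\beta = 1\) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N, beta = 20_000, 1.0                            # hypothetical sample size and true slope

# True schooling: large differences across people, small changes over time.
educ1 = rng.normal(scale=2.0, size=N)            # Var(educ*) = 4 in period 1
educ2 = educ1 + rng.normal(scale=0.5, size=N)    # Var(Delta educ*) = 0.25
# Classical measurement error, independent across periods, Var(e) = 0.5.
e1 = rng.normal(scale=np.sqrt(0.5), size=N)
e2 = rng.normal(scale=np.sqrt(0.5), size=N)
obs1, obs2 = educ1 + e1, educ2 + e2

wage1 = beta * educ1 + rng.normal(size=N)        # (log) wage in period 1
wage2 = beta * educ2 + rng.normal(size=N)        # (log) wage in period 2

def slope(x, y):
    """Bivariate OLS slope of y on x (with intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

b_cs = slope(obs1, wage1)                        # cross-section, period 1
b_fd = slope(obs2 - obs1, wage2 - wage1)         # first differences

print(f"cross-section: {b_cs:.3f} (formula: {4 / 4.5:.3f})")
print(f"first diff:    {b_fd:.3f} (formula: {0.25 / 1.25:.3f})")
```

Under these assumed variances, differencing removes most of the signal in schooling but doubles the noise variance, so the FD estimate is attenuated far more than the cross-sectional one.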
Let \(\text{same}_{it} = \mathbb{1}[i\text{ has same-nationality roommate in } t]\)
\[ \text{grades}_{it} = \alpha + \beta\;\text{same}_{it} + e_{it} \]
\(\text{grades}_{i1} = \alpha + \beta\;\text{same}_{i1} + \alpha_i + u_{i1}\)
\(\text{grades}_{i2} = \alpha + \beta\;\text{same}_{i2} + \alpha_i + u_{i2}\)
\(\Delta\text{grades}_i = \beta\;\Delta\text{same}_i + \Delta u_i\)
Ideal model: \(\text{drug usage}_{it} = \mu\;\text{post}_t + \theta_i + \rho_t + e_{it}\)
Conditional expectations — with \(T = 4\), treatment at \(t = 3\):
\[\begin{align*} \mathbb{E}[\text{drug usage}_{it} \mid t=1] &= \theta_i + \rho_1 \\ \mathbb{E}[\text{drug usage}_{it} \mid t=2] &= \theta_i + \rho_2 \\ \mathbb{E}[\text{drug usage}_{it} \mid t=3] &= \mu + \theta_i + \rho_3 \\ \mathbb{E}[\text{drug usage}_{it} \mid t=4] &= \mu + \theta_i + \rho_4 \end{align*}\]
\[\text{drug usage}_{it} = \mu\text{post}_t + \theta_i + \gamma t + e_{it}\]
| | Parametric trend | Day FE |
|---|---|---|
| Flexibility | Low (linear) | High (any shape) |
| Estimate \(\mu\)? | Yes | No (collinear) |
| Risk | Misspecified trend | No identification |
When treatment varies only at the time level, time FE absorb it completely.
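The collinearity can be seen directly from the rank of the design matrix. This is a minimal sketch, assuming three units observed on each of four days with treatment starting on day 3; unit effects are omitted because they do not change the argument, and the variable names are illustrative.

```python
import numpy as np

day = np.tile(np.array([1, 2, 3, 4]), 3)        # 3 hypothetical units x 4 days
post = (day >= 3).astype(float)                 # treatment starts on day 3

# Day fixed effects: intercept plus dummies for days 2, 3, 4 (day 1 is the base).
day_dummies = (day[:, None] == np.array([2, 3, 4])[None, :]).astype(float)
X_fe = np.column_stack([np.ones(day.size), day_dummies, post])

# Parametric alternative: intercept plus a linear trend in t.
X_trend = np.column_stack([np.ones(day.size), day.astype(float), post])

rank_fe = np.linalg.matrix_rank(X_fe)
rank_trend = np.linalg.matrix_rank(X_trend)
print("day FE:      ", X_fe.shape[1], "columns, rank", rank_fe)
print("linear trend:", X_trend.shape[1], "columns, rank", rank_trend)
```

With day dummies the matrix has five columns but rank four, so \(\mu\) is not identified; with a linear trend the three columns are linearly independent and \(\mu\) can be estimated, at the cost of assuming the trend is linear.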
\[ \log(\text{earnings})_{it} = \alpha_i + \theta_t + \sum_{j=17}^{85} \gamma_j\;\mathbb{1}[\text{age}_{it} = j] + e_{it} \]
Each age dummy \(d_j = \mathbb{1}[\text{age}_{it} = j]\) is a binary variable with proportion \(p_j = n_j/n\):
\[ \text{Var}(\hat{\gamma}_j) \propto \frac{\sigma^2}{n \cdot p_j(1 - p_j)} \]
Nonparametric flexibility comes at the cost of imprecision where data is thin.
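As a rough worked example of the variance formula, suppose (hypothetically) that 3% of observations are aged 40 and 0.1% are aged 85, with the same \(\sigma^2\) and \(n\) for both dummies:

\[ \frac{\text{Var}(\hat{\gamma}_{85})}{\text{Var}(\hat{\gamma}_{40})} \approx \frac{0.03 \times 0.97}{0.001 \times 0.999} = \frac{0.0291}{0.000999} \approx 29.1 \]

Under these assumed shares, the standard error on the age-85 dummy is about \(\sqrt{29.1} \approx 5.4\) times that on the age-40 dummy.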
A firm introduces performance pay. The panel is unbalanced: some workers leave (exiters), some stay (stayers), some join (entrants).
\[ \widehat{\log(\text{productivity})}_{it} = \hat{\alpha}_i + \hat{\beta}\;\text{performance pay}_t \]
OLS captures total change; FE isolates the within-unit mechanism. The difference is the selection channel.
\[ \text{productivity}_{it} = \beta_1\;\text{contingent}_t + \beta_2\;\text{weather}_t + \beta_3\;\text{width}_{it} + \beta_4\;\text{height}_{it} + e_{it} \]
\[\begin{align*} \text{productivity}_{it} &= \beta_1\;\text{contingent}_t + \beta_2\;\text{weather}_t + \beta_3\;\text{width}_{it} + \beta_4\;\text{height}_{it} \\ &+ \gamma_i + e_{it} \end{align*}\]
Controls handle time-varying confounders; FE handles time-invariant heterogeneity.
\[ \text{productivity}_{ijt} = \lambda_i + \mu_j + \theta_t + e_{ijt} \]
Multiple fixed effects require sufficient rotation across dimensions for identification.
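For each individual \(i\): \(y_{i}(1)\) is the outcome with treatment, \(y_{i}(0)\) is the outcome without it, and \(\tau_{i} = y_{i}(1) - y_{i}(0)\) is the individual treatment effect.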
We only observe one potential outcome per unit — the other is the counterfactual
| | \(y_{i}(1)\) | \(y_{i}(0)\) | \(\tau_{i}\) |
|---|---|---|---|
| María | £45,000 | ? | ? |
| Pedro | ? | £30,000 | ? |
We never observe both columns for the same person
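When does the observed difference in means, \(\mathbb{E}[y \mid T\!=\!1] - \mathbb{E}[y \mid T\!=\!0]\), equal the causal effect of treatment?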
Only if who gets treated is unrelated to the potential outcomes.
Add and subtract \(\mathbb{E}[y(0) \mid T\!=\!1]\) inside the observed difference:
\[\begin{align*} \mathbb{E}[y \mid T\!=\!1] - \mathbb{E}[y \mid T\!=\!0] &= \mathbb{E}[y(1) \mid T\!=\!1] - \textcolor{orange}{\mathbb{E}[y(0) \mid T\!=\!1]} \\ &+ \textcolor{orange}{\mathbb{E}[y(0) \mid T\!=\!1]} - \mathbb{E}[y(0) \mid T\!=\!0] \\ &= \underbrace{\mathbb{E}[y(1) - y(0) \mid T\!=\!1]}_{\text{ATT}} \\ &+ \underbrace{\mathbb{E}[y(0) \mid T\!=\!1] - \mathbb{E}[y(0) \mid T\!=\!0]}_{\text{selection bias}} \end{align*}\]
Research design assumption (randomisation)
\[T_{i} \perp\!\!\!\perp (y_{i}(0),\; y_{i}(1))\]
Selection bias term:
\[\begin{align*} \underbrace{\mathbb{E}[y(0) \mid T\!=\!1]}_{\substack{\text{= } \mathbb{E}[y(0)] \\ \text{by independence}}} - \underbrace{\mathbb{E}[y(0) \mid T\!=\!0]}_{\substack{\text{= } \mathbb{E}[y(0)] \\ \text{by independence}}} &= \mathbb{E}[y(0)] - \mathbb{E}[y(0)] = 0 \end{align*}\]
The observed difference equals the ATT.
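The decomposition into ATT and selection bias can be verified numerically. The sketch below assumes a constant treatment effect of 1 and compares self-selected with randomised assignment; the data-generating process and the helper `decompose` are hypothetical. Both potential outcomes are visible only because the data are simulated.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
y0 = rng.normal(size=n)              # untreated potential outcome y(0)
y1 = y0 + 1.0                        # treated potential outcome y(1); effect is 1 for everyone

def decompose(T):
    """Return (observed difference, ATT, selection bias) for assignment vector T."""
    y = np.where(T == 1, y1, y0)     # only one potential outcome is observed per unit
    observed = y[T == 1].mean() - y[T == 0].mean()
    att = (y1 - y0)[T == 1].mean()
    selection = y0[T == 1].mean() - y0[T == 0].mean()
    return observed, att, selection

# Self-selection: units with high y(0) are more likely to take treatment.
T_selected = (y0 + rng.normal(size=n) > 0).astype(int)
# Randomisation: assignment is independent of the potential outcomes.
T_random = rng.integers(0, 2, size=n)

obs_s, att_s, sel_s = decompose(T_selected)
obs_r, att_r, sel_r = decompose(T_random)
print(f"self-selected: observed={obs_s:.3f}, ATT={att_s:.3f}, selection bias={sel_s:.3f}")
print(f"randomised:    observed={obs_r:.3f}, ATT={att_r:.3f}, selection bias={sel_r:.3f}")
```

The identity holds exactly in both samples; only under randomisation is the selection term close to zero, so that the observed difference recovers the treatment effect.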
\[T_{i} \perp\!\!\!\perp (y_{i}(0),\; y_{i}(1))\]
\[\mathbb{E}[\Delta y_{i}(0) \mid T_{i}\!=\!1] = \mathbb{E}[\Delta y_{i}(0) \mid T_{i}\!=\!0]\]
where \(\Delta y_{i}(0) = y_{i2}(0) - y_{i1}(0)\)
\[\text{ATT} = \mathbb{E}[y_{i}(1) - y_{i}(0) \mid T_{i} = 1]\]
\[\text{profit}_{it} = \alpha + \beta\;\mathbb{1}[\text{simple}_{i}] + \lambda\;\mathbb{1}[t = 2005] + \theta\;(\mathbb{1}[\text{simple}_{i}] \times \mathbb{1}[t = 2005]) + e_{it}\]
\[\begin{align*} \text{force}_{ij} &= \beta_{0} + \beta_{1}\mathbb{1}[i\text{ is Black}] + \beta_{2}\mathbb{1}[j\text{ is Black}] \\ &+ \beta_{3}(\mathbb{1}[i\text{ is Black}] \times \mathbb{1}[j\text{ is Black}]) + e_{ij} \end{align*}\]
where \(i\) = citizen, \(j\) = officer
| | White Officer | Black Officer |
|---|---|---|
| White Citizen | \(\beta_{0}\) | \(\beta_{0} + \beta_{2}\) |
| Black Citizen | \(\beta_{0} + \beta_{1}\) | \(\beta_{0} + \beta_{1} + \beta_{2} + \beta_{3}\) |
\[\text{grades}_{it} = \alpha_{i} + \lambda_{t} + \beta\;(\mathbb{1}[\text{non-DGB}_{i}] \times \mathbb{1}[\text{prohibited}_{t}]) + e_{it}\]
\[\begin{align*} \text{performance}_{it} &= \beta_{0} + \delta_{0}\mathbb{1}[t = 2013] + \beta_{1}\mathbb{1}[\text{even}_{i}] + \delta_{1}(\mathbb{1}[t = 2013] \times \mathbb{1}[\text{even}_{i}]) \\ &+ e_{it} \end{align*}\]
\[\text{performance}_{i} = \eta_{0} + \eta_{1}\;\mathbb{1}[\text{even}_{i}] + e_{i}\]
\[y_{it} = \beta_{0} + \delta_{0}\;\mathbb{1}[t = 2] + \beta_{1}\;\mathbb{1}[\text{Treatment}_{i}] + \delta_{1}\;(\mathbb{1}[t = 2] \times \mathbb{1}[\text{Treatment}_{i}]) + e_{it}\]
| | Before (\(t = 1\)) | After (\(t = 2\)) | Difference |
|---|---|---|---|
| Control | \(\beta_{0}\) | \(\beta_{0} + \delta_{0}\) | \(\delta_{0}\) |
| Treatment | \(\beta_{0} + \beta_{1}\) | \(\beta_{0} + \beta_{1} + \delta_{0} + \delta_{1}\) | \(\delta_{0} + \delta_{1}\) |
| Difference | \(\beta_{1}\) | \(\beta_{1} + \delta_{1}\) | \(\delta_{1}\) |
\[\delta_{1} = \underbrace{(\bar{y}_{T,\text{after}} - \bar{y}_{T,\text{before}})}_{\text{change for treated}} - \underbrace{(\bar{y}_{C,\text{after}} - \bar{y}_{C,\text{before}})}_{\text{change for control}}\]
\[\delta_{1} = \underbrace{(\bar{y}_{T,\text{after}} - \bar{y}_{C,\text{after}})}_{\text{gap after treatment}} - \underbrace{(\bar{y}_{T,\text{before}} - \bar{y}_{C,\text{before}})}_{\text{gap before treatment}}\]
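Both ways of writing \(\delta_1\) give the same number, and both equal the interaction coefficient from the regression. A minimal sketch on simulated repeated cross-sections follows; the cell size and the parameter values (\(\beta_0 = 10\), \(\delta_0 = 1\), \(\beta_1 = 2\), \(\delta_1 = 0.5\)) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500                                         # observations per group-period cell
treat = np.repeat([0, 0, 1, 1], n)
after = np.repeat([0, 1, 0, 1], n)
# Hypothetical parameters: beta_0 = 10, delta_0 = 1, beta_1 = 2, delta_1 = 0.5.
y = 10 + 1.0 * after + 2.0 * treat + 0.5 * after * treat + rng.normal(size=4 * n)

def cell(t, a):
    """Mean outcome for group t (1 = treatment) in period a (1 = after)."""
    return y[(treat == t) & (after == a)].mean()

# Change for treated minus change for control.
did_changes = (cell(1, 1) - cell(1, 0)) - (cell(0, 1) - cell(0, 0))
# Gap after treatment minus gap before treatment.
did_gaps = (cell(1, 1) - cell(0, 1)) - (cell(1, 0) - cell(0, 0))

# Saturated regression: constant, after, treatment, and their interaction.
X = np.column_stack([np.ones(4 * n), after, treat, after * treat])
delta1_ols = np.linalg.lstsq(X, y, rcond=None)[0][3]

print(f"changes: {did_changes:.4f}  gaps: {did_gaps:.4f}  OLS: {delta1_ols:.4f}")
```

Because the regression is saturated in the two dummies, the OLS interaction coefficient reproduces the difference of cell means up to floating-point error.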
\[\mathbb{E}[y_{i2}(0) - y_{i1}(0) \mid T_{i} = 1] = \mathbb{E}[y_{i2}(0) - y_{i1}(0) \mid T_{i} = 0]\]
\[y_{it} = \alpha_{i} + \lambda_{t} + \delta\;(T_{i} \times \mathbb{1}[t \geq t_{0}]) + e_{it}\]
\[\text{sickness}_{it} = \beta\;(\Delta\text{years to work}_{i} \times \mathbb{1}[\text{post}_{t}]) + \text{controls}_{it} + \alpha_{i} + \lambda_{t} + e_{it}\]
Write the model for \(t = 1\) and \(t = 2\):
\[\begin{align*} y_{i1} &= \beta_0 + \beta_1 x_{i1,1} + \cdots + \beta_k x_{i1,k} + a_i + v_{i1} \\ y_{i2} &= (\beta_0 + \delta) + \beta_1 x_{i2,1} + \cdots + \beta_k x_{i2,k} + a_i + v_{i2} \end{align*}\]
Subtract: \(a_i - a_i = 0\).
\[ \Delta y_i = \delta + \beta_1\Delta x_{i1} + \beta_2\Delta x_{i2} + \cdots + \beta_k\Delta x_{ik} + \Delta v_i \]
\[ \text{plim}\;\hat\beta_{\text{CS}} = \beta \cdot \frac{\text{Var}(\text{educ}^*)}{\text{Var}(\text{educ}^*) + \text{Var}(e)} \]
\[ \text{plim}\;\hat\beta_{\text{FD}} = \beta \cdot \frac{\text{Var}(\Delta\text{educ}^*)}{\text{Var}(\Delta\text{educ}^*) + \text{Var}(e_{i1}) + \text{Var}(e_{i2})} \]
\[ \log(\text{earnings})_{it} = \alpha_i + \theta_t + \sum_{j=17}^{85} \gamma_j\;\mathbb{1}[\text{age}_{it} = j] + e_{it} \]
suffers from a fundamental identification problem:
\[ \text{age}_{it} = \text{year}_t - \text{birth year}_i \]
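The identity above implies exact collinearity among age, period, and cohort. The sketch below illustrates this with linear terms rather than full sets of dummies, which is a simplifying assumption; the cohorts and years are hypothetical.

```python
import numpy as np

birth_year = np.array([1960, 1970, 1980])       # three hypothetical cohorts
year = np.array([2000, 2005, 2010])             # three hypothetical survey years
b, t = np.meshgrid(birth_year, year, indexing="ij")
b, t = b.ravel().astype(float), t.ravel().astype(float)
age = t - b                                     # age = year - birth year

# Year and birth year are centred only to keep the matrix well scaled;
# with a constant included, centring does not change the collinearity.
X = np.column_stack([np.ones(b.size), age, t - 2000.0, b - 1970.0])
rank = np.linalg.matrix_rank(X)
print(X.shape[1], "columns, rank", rank)
```

Four columns but rank three: one of the linear age, year, and cohort effects must be restricted before the others can be estimated.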
\[\begin{align*} \hat{\beta}_{\text{OLS}} &= \bar{y}_{\text{post}} - \bar{y}_{\text{pre}} \\ &= \left[\frac{|S|}{|S|+|N|}\bar{y}^S_{\text{post}} + \frac{|N|}{|S|+|N|}\bar{y}^N_{\text{post}}\right] \\ &\quad - \left[\frac{|S|}{|S|+|X|}\bar{y}^S_{\text{pre}} + \frac{|X|}{|S|+|X|}\bar{y}^X_{\text{pre}}\right] \end{align*}\]
\[ \hat{\beta}_{\text{FE}} = \frac{1}{|S|}\sum_{i \in S}(y_{i,\text{post}} - y_{i,\text{pre}}) \]
\(|S|\), \(|X|\), \(|N|\) = number of stayers, exiters, entrants.
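A toy calculation makes the gap between the two estimators concrete. The numbers below are hypothetical: two stayers, two exiters, and two entrants, with one observation per worker per period.

```python
import numpy as np

# Hypothetical productivity data, before and after performance pay.
stayers_pre = np.array([10.0, 12.0])
stayers_post = np.array([11.0, 13.0])
exiters_pre = np.array([6.0, 8.0])       # observed only before
entrants_post = np.array([14.0, 16.0])   # observed only after

# Pooled OLS on the post dummy: difference in raw means, which mixes
# within-worker changes with changes in workforce composition.
y_post = np.concatenate([stayers_post, entrants_post])
y_pre = np.concatenate([stayers_pre, exiters_pre])
b_ols = y_post.mean() - y_pre.mean()

# Worker FE: only stayers contribute within-worker variation in the post dummy.
b_fe = (stayers_post - stayers_pre).mean()

print(f"OLS: {b_ols}, FE: {b_fe}, selection channel: {b_ols - b_fe}")
```

In this made-up example the raw difference is 4.5 while the within-worker change is 1.0, so 3.5 of the total change comes from low-productivity exiters being replaced by high-productivity entrants.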
The call centre exercise (Q12) builds on Abowd et al. (1999) and Fenizia (2022).
\[ \log(\text{wages})_{it} = \alpha_i + \psi_{J(i,t)} + x'_{it}\beta + e_{it} \]
Treatment starts on day 3:
| Day | \(d_2\) | \(d_3\) | \(d_4\) | \(\text{post}_t\) |
|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 |
| 2 | 1 | 0 | 0 | 0 |
| 3 | 0 | 1 | 0 | 1 |
| 4 | 0 | 0 | 1 | 1 |
\(\text{post}_t = d_3 + d_4\) — an exact linear combination of the day dummies. Stata would drop one variable automatically. The treatment effect \(\mu\) cannot be separated from the day effects.